A Machine Learning Approach for Phenotype Name Recognition

Authors

  • Maryam Khordad
  • Robert E. Mercer
  • Peter K. Rogan
Abstract

Extracting biomedical named entities is one of the major challenges in the automatic processing of biomedical literature. This paper proposes a machine learning approach for finding phenotype names in text. The rules found in our previously developed rule-based system are encoded as features in a machine learning infrastructure. The system also uses two available resources: MetaMap and HPO. As we are not aware of any available corpus for phenotype names, we constructed one. Because manually tagging the corpus was not feasible for us, we first tagged only HPO phenotypes in the corpus and then improved the tagging with a semi-supervised learning method. The evaluation results (F-score 92.25) suggest that the system performs well and outperforms the rule-based system.
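The initial tagging step mentioned in the abstract, where only known HPO phenotypes are marked in the corpus, can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the three-term lexicon, the `B-PHEN`/`I-PHEN` tag names, the greedy longest-match strategy, and the `seed_tag` function are all hypothetical, and the abstract does not say how HPO terms were matched or which semi-supervised method followed.

```python
from typing import List, Set, Tuple

# Hypothetical seed lexicon standing in for HPO term names (assumption:
# the real ontology is far larger and was not necessarily matched this way).
SEED_TERMS: Set[Tuple[str, ...]] = {
    ("short", "stature"),
    ("hearing", "loss"),
    ("seizures",),
}


def seed_tag(tokens: List[str], lexicon: Set[Tuple[str, ...]]) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup against the lexicon."""
    lowered = [t.lower() for t in tokens]
    max_len = max((len(term) for term in lexicon), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in lexicon:
                matched = n
                break
        if matched:
            tags[i] = "B-PHEN"
            for j in range(i + 1, i + matched):
                tags[j] = "I-PHEN"
            i += matched
        else:
            i += 1
    return tags


tokens = "The patient presented with short stature and seizures".split()
tags = seed_tag(tokens, SEED_TERMS)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Labels produced this way cover only phrases already in the lexicon, which is why a later semi-supervised step, as the abstract describes, would be needed to pick up phenotype names the seed list misses.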


Similar resources

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods have good performance in the Persian language if they are trained with good features. To get good performanc...


A hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine

Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. CNN is an end-to-end tool which makes processing procedure integrated, but in some situations, this processing tool requires to be fused with machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...


Coreference resolution with deep learning in the Persian Language

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...


Emotion Detection in Persian Text; A Machine Learning Model

This study aimed to develop a computational model for recognition of emotion in Persian text as a supervised machine learning problem. We considered the Plutchik emotion model as supervised learning criteria and Support Vector Machine (SVM) as baseline classifier. We also used NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...


A Hybrid Statistical Approach for Named Entity Recognition for Malayalam Language

Named-Entity Recognition (NER) plays a significant role in classifying or locating atomic elements in text into predefined categories such as the name of persons, organizations, locations, expression of times, quantities, monetary values, temporal expressions and percentages. Several Statistical methods with supervised and unsupervised learning have applied English and some other Indian languag...




Publication date: 2012